Think of ggplots like building layers of a cake. Each layer is added on top.
[1] Create a canvas defined by mapping to columns in your data
[2] Add 1 or more geometrics (geoms)
[3] Add formatting features. {Scales, Themes, Facets, etc}
Geom/Geometries: These define how your data looks on your plot
Formatting: These add customization to your plot to control wide-range of appearence options. Scales, Faceting, Position Adjustments, Labels, Legends, Themes are commonly customerized.
Enables Customization on Steroids ggplot2 is super flexible giving tons of options. The downside in this flexibility is that it takes a while to learn. Matt Dancho —-
For business reports the it is important to get the themes right and reported with same formated theme. __ Matt Dancho
The key to a good ggplot is knowing how to format the data for a ggplot.
unlike base graphics, ggplot works with data.frames and not individual vectors.
revenue_by_year_tbl <- bike_orderlines_tbl %>%
select(order_date, total_price) %>%
mutate(year = year(order_date)) %>%
group_by(year) %>%
summarize(revenue = sum(total_price)) %>%
ungroup()
revenue_by_year_tbl
## # A tibble: 5 × 2
## year revenue
## <dbl> <dbl>
## 1 2011 11292885
## 2 2012 12163075
## 3 2013 16480775
## 4 2014 13924085
## 5 2015 17171510
Mapping: Connects data columns to ggplot aesthetics
GGplot structure
Step 1: Build Canvas: Involves mapping columns in data to ggplot() aesthetics (x, y, color, fill, size etc) via aes() function
Step 2: Geometries: 2nd Layer that generates a visual depiction of the data using a geometry type (e.g. Line Plot)
revenue_by_year_tbl %>%
# Canvas
ggplot(aes(x = year, y = revenue, color = revenue)) +
# Geometries
geom_line(size = 1) +
# aesthetics specifically targeting certain geometries
geom_point(aes(size = revenue))
Scale Color: Enables customizing the color aesthetic (mapped to revenue in this case)
Scale X & Y: Enables customizing the x-axis and y-axis (mapped to year and revenue in this case)
Labels: Changes the text for title, subtitle, x, y, legends & captions
Themes: Usually we start with a base theme e.g. theme_bw() and then modify with the theme() function
g <- revenue_by_year_tbl %>%
# Canvas
ggplot(aes(x = year, y = revenue, color = revenue)) +
# Geometries
geom_line(size = 1) +
geom_point(size = 5) +
geom_smooth(method = "lm", se = FALSE) +
# Formatting
expand_limits(y = 0) +
scale_color_continuous(low = "red", high = "black",
labels = scales::dollar_format(scale = 1/1e6, suffix = "M")) +
scale_y_continuous(labels = scales::dollar_format(scale = 1/1e6, suffix = "M")) +
labs(
title = "Revenue",
subtitle = "Sales are trending up and to the right!",
x = "",
y = "Sales (Millions)",
color = "Rev ($M)",
caption = "What's happening?\nSales numbers showing year-over-year growth."
) +
theme_bw() +
theme(legend.position = "right", legend.direction = "vertical")
Key Concept: The ggplot object is just a list that captures layers, scales, mappings, theme, coordinates, and labels that you customize
g
## `geom_smooth()` using formula 'y ~ x'
bike_orderlines_tbl <- read_rds("00_data/bike_sales/data_wrangled/bike_orderlines.rds")
glimpse(bike_orderlines_tbl)
## Rows: 15,644
## Columns: 13
## $ order_date <dttm> 2011-01-07, 2011-01-07, 2011-01-10, 2011-01-10, 2011-0…
## $ order_id <dbl> 1, 1, 2, 2, 3, 3, 3, 3, 3, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7…
## $ order_line <dbl> 1, 2, 1, 2, 1, 2, 3, 4, 5, 1, 1, 2, 3, 4, 1, 2, 3, 4, 1…
## $ quantity <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1…
## $ price <dbl> 6070, 5970, 2770, 5970, 10660, 3200, 12790, 5330, 1570,…
## $ total_price <dbl> 6070, 5970, 2770, 5970, 10660, 3200, 12790, 5330, 1570,…
## $ model <chr> "Jekyll Carbon 2", "Trigger Carbon 2", "Beast of the Ea…
## $ category_1 <chr> "Mountain", "Mountain", "Mountain", "Mountain", "Road",…
## $ category_2 <chr> "Over Mountain", "Over Mountain", "Trail", "Over Mounta…
## $ frame_material <chr> "Carbon", "Carbon", "Aluminum", "Carbon", "Carbon", "Ca…
## $ bikeshop_name <chr> "Ithaca Mountain Climbers", "Ithaca Mountain Climbers",…
## $ city <chr> "Ithaca", "Ithaca", "Kansas City", "Kansas City", "Loui…
## $ state <chr> "NY", "NY", "KS", "KS", "KY", "KY", "KY", "KY", "KY", "…
Great for Continuous vs Continuous
Also good for Lollipop Charts (more on this in advanced plots)
Goal: Explain relationship between order value and quantity of bikes sold
# Data Manipulation
order_value_tbl <- bike_orderlines_tbl %>%
select(order_id, order_line, total_price, quantity) %>%
group_by(order_id) %>%
summarise(
total_quantity = sum(quantity),
total_price = sum(total_price)
) %>%
ungroup()
# Scatter Plot
order_value_tbl %>%
ggplot(aes(x = total_quantity, y = total_price)) +
# geometries
geom_point(alpha = 0.5, size = 2) +
# uses spine (default) y ~ s(x, bs = "cs")
# change method to 'lm'
geom_smooth(method = 'lm', se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
Great for time series
Goal: Describe revenue by Month, expose cyclic nature
# Data Manipulation
revenue_by_month_tbl <- bike_orderlines_tbl %>%
select(order_date, total_price) %>%
mutate(year_month = floor_date(order_date, "months") %>% ymd()) %>%
group_by(year_month) %>%
summarise(revenue = sum(total_price)) %>%
ungroup()
# Line Plot
revenue_by_month_tbl %>%
ggplot(aes(x = year_month, y = revenue)) +
geom_line(size = 0.5, linetype = 1) +
geom_smooth(method = 'loess', span = 0.2)
## `geom_smooth()` using formula 'y ~ x'
# Data Manipulation
revenue_by_category_2 <- bike_orderlines_tbl %>%
select(category_2, total_price) %>%
group_by(category_2) %>%
summarise(revenue = sum(total_price)) %>%
ungroup()
# Bar Plot
revenue_by_category_2 %>%
mutate(category_2 = category_2 %>% fct_reorder(revenue)) %>%
ggplot(aes(x = category_2, y = revenue)) +
geom_col(fill = palette_dark()[6]) +
coord_flip()
There are two types of bar charts: geom_bar() and geom_col().
geom_bar() makes the height of the bar proportional to the number of cases in each group (or if the weight aesthetic is supplied, the sum of the weights). If you want the heights of the bars to represent values in the data, use geom_col() instead geom_bar() uses stat_count() by default: it counts the number of cases at each x position
summary:
Great for inspecting the distribution of a variable
Goal: Unit price of bicycles
# Histogram
bike_orderlines_tbl %>%
distinct(model, price) %>%
ggplot(aes(price)) +
geom_histogram(bins = 20, color = "white", fill = "blue", alpha = 0.5)
# Goal: Unit price of bicylce, segmenting by frame material
# Histogram
bike_orderlines_tbl %>%
distinct(model, price, frame_material) %>%
ggplot(aes(price, fill = frame_material)) +
geom_histogram() +
facet_wrap(~frame_material, ncol = 1) +
scale_fill_tq() +
theme_tq()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# Density
bike_orderlines_tbl %>%
distinct(model, price, frame_material) %>%
ggplot(aes(price, fill = frame_material)) +
geom_density(alpha = 0.5) +
scale_fill_tq() +
theme_tq()
Great for comparing distributions
Goal: Unit price of models, segmenting by category 2
# Data Manipulation
unit_price_by_cat_2_tbl <- bike_orderlines_tbl %>%
select(category_2, model, price) %>%
distinct() %>%
mutate(category_2 = as_factor(category_2) %>% fct_reorder(price))
# Box Plot
unit_price_by_cat_2_tbl %>%
ggplot(aes(category_2, price)) +
geom_boxplot() +
coord_flip() +
theme_tq()
unit_price_by_cat_2_tbl %>%
ggplot(aes(category_2, price)) +
geom_jitter(width = 0.1, color = "#2c3e50") +
geom_violin(alpha = 0.5) +
coord_flip() +
theme_tq()
# Data Manipulation
revenue_by_year_tbl <- bike_orderlines_tbl %>%
select(order_date, total_price) %>%
mutate(year = year(order_date)) %>%
group_by(year) %>%
summarise(revenue = sum(total_price)) %>%
ungroup()
revenue_by_year_tbl %>%
ggplot(aes(x = year, y = revenue)) +
geom_col(fill="#2c3e50") +
geom_text(aes(label = scales::dollar(revenue, scale = 1e-6, suffix = "M")), vjust = 1.5, color = "white")
revenue_by_year_tbl %>%
mutate(revenue_text = scales::dollar(revenue, scale = 1e-6, suffix = "M")) %>%
ggplot(aes(x = year, y = revenue)) +
geom_col(fill="#2c3e50") +
geom_text(aes(label = revenue_text), vjust = 1.5, color = "white") +
expand_limits(y = 2e7) +
theme_tq()
revenue_by_year_tbl %>%
mutate(revenue_text = scales::dollar(revenue, scale = 1e-6, suffix = "M")) %>%
ggplot(aes(x = year, y = revenue)) +
geom_col(fill="#2c3e50") +
geom_label(aes(label = revenue_text), vjust = 0.6, size = 5) +
expand_limits(y = 2e7) +
theme_tq()
revenue_by_year_tbl %>%
mutate(revenue_text = scales::dollar(revenue, scale = 1e-6, suffix = "M")) %>%
ggplot(aes(x = year, y = revenue)) +
geom_col(fill="#2c3e50") +
geom_label(aes(label = revenue_text), vjust = 0.05, size = 5) +
expand_limits(y = 2e7) +
theme_tq()
revenue_by_year_tbl %>%
mutate(revenue_text = scales::dollar(revenue, scale = 1e-6, suffix = "M")) %>%
ggplot(aes(x = year, y = revenue)) +
geom_col(fill="#2c3e50") +
geom_smooth(method = "lm", se = FALSE) +
geom_text(aes(label = revenue_text), vjust = 1.5, color = "white") +
geom_label(label = "Major Demand This Year",
vjust = -0.5,
size = 5,
fontface = "bold",
fill = palette_light()[4],
data = revenue_by_year_tbl %>%
filter(year %in% c(2013))) +
expand_limits(y = 2e7) +
theme_tq()
## `geom_smooth()` using formula 'y ~ x'
fct_reorder : reorder by one axis (e.g. revenue)
fct_reorder2 : reorder by two axis (e.g. year, revenue)
sales_by_year_category_2_tbl <- bike_orderlines_tbl %>%
select(order_date, category_2, total_price) %>%
mutate(order_date = ymd(order_date)) %>%
mutate(year = year(order_date)) %>%
group_by(category_2, year) %>%
summarize(revenue = sum(total_price)) %>%
ungroup() %>%
mutate(category_2 = fct_reorder2(category_2, year, revenue))
sales_by_year_category_2_tbl
## # A tibble: 45 × 3
## category_2 year revenue
## <fct> <dbl> <dbl>
## 1 Cross Country Race 2011 2917250
## 2 Cross Country Race 2012 3360800
## 3 Cross Country Race 2013 4315430
## 4 Cross Country Race 2014 3691780
## 5 Cross Country Race 2015 4939370
## 6 Cyclocross 2011 378980
## 7 Cyclocross 2012 342090
## 8 Cyclocross 2013 503580
## 9 Cyclocross 2014 390250
## 10 Cyclocross 2015 493220
## # … with 35 more rows
sales_by_year_category_2_tbl %>%
mutate(category_2_num = as.numeric(category_2)) %>%
arrange(category_2_num)
## # A tibble: 45 × 4
## category_2 year revenue category_2_num
## <fct> <dbl> <dbl> <dbl>
## 1 Cross Country Race 2011 2917250 1
## 2 Cross Country Race 2012 3360800 1
## 3 Cross Country Race 2013 4315430 1
## 4 Cross Country Race 2014 3691780 1
## 5 Cross Country Race 2015 4939370 1
## 6 Elite Road 2011 2493315 2
## 7 Elite Road 2012 2637935 2
## 8 Elite Road 2013 3394210 2
## 9 Elite Road 2014 3170125 2
## 10 Elite Road 2015 3639080 2
## # … with 35 more rows
It is important to be comfrontable working with colors
colours()
## [1] "white" "aliceblue" "antiquewhite"
## [4] "antiquewhite1" "antiquewhite2" "antiquewhite3"
## [7] "antiquewhite4" "aquamarine" "aquamarine1"
## [10] "aquamarine2" "aquamarine3" "aquamarine4"
## [13] "azure" "azure1" "azure2"
## [16] "azure3" "azure4" "beige"
## [19] "bisque" "bisque1" "bisque2"
## [22] "bisque3" "bisque4" "black"
## [25] "blanchedalmond" "blue" "blue1"
## [28] "blue2" "blue3" "blue4"
## [31] "blueviolet" "brown" "brown1"
## [34] "brown2" "brown3" "brown4"
## [37] "burlywood" "burlywood1" "burlywood2"
## [40] "burlywood3" "burlywood4" "cadetblue"
## [43] "cadetblue1" "cadetblue2" "cadetblue3"
## [46] "cadetblue4" "chartreuse" "chartreuse1"
## [49] "chartreuse2" "chartreuse3" "chartreuse4"
## [52] "chocolate" "chocolate1" "chocolate2"
## [55] "chocolate3" "chocolate4" "coral"
## [58] "coral1" "coral2" "coral3"
## [61] "coral4" "cornflowerblue" "cornsilk"
## [64] "cornsilk1" "cornsilk2" "cornsilk3"
## [67] "cornsilk4" "cyan" "cyan1"
## [70] "cyan2" "cyan3" "cyan4"
## [73] "darkblue" "darkcyan" "darkgoldenrod"
## [76] "darkgoldenrod1" "darkgoldenrod2" "darkgoldenrod3"
## [79] "darkgoldenrod4" "darkgray" "darkgreen"
## [82] "darkgrey" "darkkhaki" "darkmagenta"
## [85] "darkolivegreen" "darkolivegreen1" "darkolivegreen2"
## [88] "darkolivegreen3" "darkolivegreen4" "darkorange"
## [91] "darkorange1" "darkorange2" "darkorange3"
## [94] "darkorange4" "darkorchid" "darkorchid1"
## [97] "darkorchid2" "darkorchid3" "darkorchid4"
## [100] "darkred" "darksalmon" "darkseagreen"
## [103] "darkseagreen1" "darkseagreen2" "darkseagreen3"
## [106] "darkseagreen4" "darkslateblue" "darkslategray"
## [109] "darkslategray1" "darkslategray2" "darkslategray3"
## [112] "darkslategray4" "darkslategrey" "darkturquoise"
## [115] "darkviolet" "deeppink" "deeppink1"
## [118] "deeppink2" "deeppink3" "deeppink4"
## [121] "deepskyblue" "deepskyblue1" "deepskyblue2"
## [124] "deepskyblue3" "deepskyblue4" "dimgray"
## [127] "dimgrey" "dodgerblue" "dodgerblue1"
## [130] "dodgerblue2" "dodgerblue3" "dodgerblue4"
## [133] "firebrick" "firebrick1" "firebrick2"
## [136] "firebrick3" "firebrick4" "floralwhite"
## [139] "forestgreen" "gainsboro" "ghostwhite"
## [142] "gold" "gold1" "gold2"
## [145] "gold3" "gold4" "goldenrod"
## [148] "goldenrod1" "goldenrod2" "goldenrod3"
## [151] "goldenrod4" "gray" "gray0"
## [154] "gray1" "gray2" "gray3"
## [157] "gray4" "gray5" "gray6"
## [160] "gray7" "gray8" "gray9"
## [163] "gray10" "gray11" "gray12"
## [166] "gray13" "gray14" "gray15"
## [169] "gray16" "gray17" "gray18"
## [172] "gray19" "gray20" "gray21"
## [175] "gray22" "gray23" "gray24"
## [178] "gray25" "gray26" "gray27"
## [181] "gray28" "gray29" "gray30"
## [184] "gray31" "gray32" "gray33"
## [187] "gray34" "gray35" "gray36"
## [190] "gray37" "gray38" "gray39"
## [193] "gray40" "gray41" "gray42"
## [196] "gray43" "gray44" "gray45"
## [199] "gray46" "gray47" "gray48"
## [202] "gray49" "gray50" "gray51"
## [205] "gray52" "gray53" "gray54"
## [208] "gray55" "gray56" "gray57"
## [211] "gray58" "gray59" "gray60"
## [214] "gray61" "gray62" "gray63"
## [217] "gray64" "gray65" "gray66"
## [220] "gray67" "gray68" "gray69"
## [223] "gray70" "gray71" "gray72"
## [226] "gray73" "gray74" "gray75"
## [229] "gray76" "gray77" "gray78"
## [232] "gray79" "gray80" "gray81"
## [235] "gray82" "gray83" "gray84"
## [238] "gray85" "gray86" "gray87"
## [241] "gray88" "gray89" "gray90"
## [244] "gray91" "gray92" "gray93"
## [247] "gray94" "gray95" "gray96"
## [250] "gray97" "gray98" "gray99"
## [253] "gray100" "green" "green1"
## [256] "green2" "green3" "green4"
## [259] "greenyellow" "grey" "grey0"
## [262] "grey1" "grey2" "grey3"
## [265] "grey4" "grey5" "grey6"
## [268] "grey7" "grey8" "grey9"
## [271] "grey10" "grey11" "grey12"
## [274] "grey13" "grey14" "grey15"
## [277] "grey16" "grey17" "grey18"
## [280] "grey19" "grey20" "grey21"
## [283] "grey22" "grey23" "grey24"
## [286] "grey25" "grey26" "grey27"
## [289] "grey28" "grey29" "grey30"
## [292] "grey31" "grey32" "grey33"
## [295] "grey34" "grey35" "grey36"
## [298] "grey37" "grey38" "grey39"
## [301] "grey40" "grey41" "grey42"
## [304] "grey43" "grey44" "grey45"
## [307] "grey46" "grey47" "grey48"
## [310] "grey49" "grey50" "grey51"
## [313] "grey52" "grey53" "grey54"
## [316] "grey55" "grey56" "grey57"
## [319] "grey58" "grey59" "grey60"
## [322] "grey61" "grey62" "grey63"
## [325] "grey64" "grey65" "grey66"
## [328] "grey67" "grey68" "grey69"
## [331] "grey70" "grey71" "grey72"
## [334] "grey73" "grey74" "grey75"
## [337] "grey76" "grey77" "grey78"
## [340] "grey79" "grey80" "grey81"
## [343] "grey82" "grey83" "grey84"
## [346] "grey85" "grey86" "grey87"
## [349] "grey88" "grey89" "grey90"
## [352] "grey91" "grey92" "grey93"
## [355] "grey94" "grey95" "grey96"
## [358] "grey97" "grey98" "grey99"
## [361] "grey100" "honeydew" "honeydew1"
## [364] "honeydew2" "honeydew3" "honeydew4"
## [367] "hotpink" "hotpink1" "hotpink2"
## [370] "hotpink3" "hotpink4" "indianred"
## [373] "indianred1" "indianred2" "indianred3"
## [376] "indianred4" "ivory" "ivory1"
## [379] "ivory2" "ivory3" "ivory4"
## [382] "khaki" "khaki1" "khaki2"
## [385] "khaki3" "khaki4" "lavender"
## [388] "lavenderblush" "lavenderblush1" "lavenderblush2"
## [391] "lavenderblush3" "lavenderblush4" "lawngreen"
## [394] "lemonchiffon" "lemonchiffon1" "lemonchiffon2"
## [397] "lemonchiffon3" "lemonchiffon4" "lightblue"
## [400] "lightblue1" "lightblue2" "lightblue3"
## [403] "lightblue4" "lightcoral" "lightcyan"
## [406] "lightcyan1" "lightcyan2" "lightcyan3"
## [409] "lightcyan4" "lightgoldenrod" "lightgoldenrod1"
## [412] "lightgoldenrod2" "lightgoldenrod3" "lightgoldenrod4"
## [415] "lightgoldenrodyellow" "lightgray" "lightgreen"
## [418] "lightgrey" "lightpink" "lightpink1"
## [421] "lightpink2" "lightpink3" "lightpink4"
## [424] "lightsalmon" "lightsalmon1" "lightsalmon2"
## [427] "lightsalmon3" "lightsalmon4" "lightseagreen"
## [430] "lightskyblue" "lightskyblue1" "lightskyblue2"
## [433] "lightskyblue3" "lightskyblue4" "lightslateblue"
## [436] "lightslategray" "lightslategrey" "lightsteelblue"
## [439] "lightsteelblue1" "lightsteelblue2" "lightsteelblue3"
## [442] "lightsteelblue4" "lightyellow" "lightyellow1"
## [445] "lightyellow2" "lightyellow3" "lightyellow4"
## [448] "limegreen" "linen" "magenta"
## [451] "magenta1" "magenta2" "magenta3"
## [454] "magenta4" "maroon" "maroon1"
## [457] "maroon2" "maroon3" "maroon4"
## [460] "mediumaquamarine" "mediumblue" "mediumorchid"
## [463] "mediumorchid1" "mediumorchid2" "mediumorchid3"
## [466] "mediumorchid4" "mediumpurple" "mediumpurple1"
## [469] "mediumpurple2" "mediumpurple3" "mediumpurple4"
## [472] "mediumseagreen" "mediumslateblue" "mediumspringgreen"
## [475] "mediumturquoise" "mediumvioletred" "midnightblue"
## [478] "mintcream" "mistyrose" "mistyrose1"
## [481] "mistyrose2" "mistyrose3" "mistyrose4"
## [484] "moccasin" "navajowhite" "navajowhite1"
## [487] "navajowhite2" "navajowhite3" "navajowhite4"
## [490] "navy" "navyblue" "oldlace"
## [493] "olivedrab" "olivedrab1" "olivedrab2"
## [496] "olivedrab3" "olivedrab4" "orange"
## [499] "orange1" "orange2" "orange3"
## [502] "orange4" "orangered" "orangered1"
## [505] "orangered2" "orangered3" "orangered4"
## [508] "orchid" "orchid1" "orchid2"
## [511] "orchid3" "orchid4" "palegoldenrod"
## [514] "palegreen" "palegreen1" "palegreen2"
## [517] "palegreen3" "palegreen4" "paleturquoise"
## [520] "paleturquoise1" "paleturquoise2" "paleturquoise3"
## [523] "paleturquoise4" "palevioletred" "palevioletred1"
## [526] "palevioletred2" "palevioletred3" "palevioletred4"
## [529] "papayawhip" "peachpuff" "peachpuff1"
## [532] "peachpuff2" "peachpuff3" "peachpuff4"
## [535] "peru" "pink" "pink1"
## [538] "pink2" "pink3" "pink4"
## [541] "plum" "plum1" "plum2"
## [544] "plum3" "plum4" "powderblue"
## [547] "purple" "purple1" "purple2"
## [550] "purple3" "purple4" "red"
## [553] "red1" "red2" "red3"
## [556] "red4" "rosybrown" "rosybrown1"
## [559] "rosybrown2" "rosybrown3" "rosybrown4"
## [562] "royalblue" "royalblue1" "royalblue2"
## [565] "royalblue3" "royalblue4" "saddlebrown"
## [568] "salmon" "salmon1" "salmon2"
## [571] "salmon3" "salmon4" "sandybrown"
## [574] "seagreen" "seagreen1" "seagreen2"
## [577] "seagreen3" "seagreen4" "seashell"
## [580] "seashell1" "seashell2" "seashell3"
## [583] "seashell4" "sienna" "sienna1"
## [586] "sienna2" "sienna3" "sienna4"
## [589] "skyblue" "skyblue1" "skyblue2"
## [592] "skyblue3" "skyblue4" "slateblue"
## [595] "slateblue1" "slateblue2" "slateblue3"
## [598] "slateblue4" "slategray" "slategray1"
## [601] "slategray2" "slategray3" "slategray4"
## [604] "slategrey" "snow" "snow1"
## [607] "snow2" "snow3" "snow4"
## [610] "springgreen" "springgreen1" "springgreen2"
## [613] "springgreen3" "springgreen4" "steelblue"
## [616] "steelblue1" "steelblue2" "steelblue3"
## [619] "steelblue4" "tan" "tan1"
## [622] "tan2" "tan3" "tan4"
## [625] "thistle" "thistle1" "thistle2"
## [628] "thistle3" "thistle4" "tomato"
## [631] "tomato1" "tomato2" "tomato3"
## [634] "tomato4" "turquoise" "turquoise1"
## [637] "turquoise2" "turquoise3" "turquoise4"
## [640] "violet" "violetred" "violetred1"
## [643] "violetred2" "violetred3" "violetred4"
## [646] "wheat" "wheat1" "wheat2"
## [649] "wheat3" "wheat4" "whitesmoke"
## [652] "yellow" "yellow1" "yellow2"
## [655] "yellow3" "yellow4" "yellowgreen"
sales_by_year_category_2_tbl %>%
ggplot(aes(x = year, y = revenue)) +
geom_col(fill = "slateblue2")
(e.g. White = 255 - 255 - 255)
col2rgb("slateblue2")
## [,1]
## red 122
## green 103
## blue 238
col2rgb("#2C3E50")
## [,1]
## red 44
## green 62
## blue 80
rgb(44, 62, 80, maxColorValue = 225)
## [1] "#32465B"
tidyquant
tidyquant::palette_light()[1] %>% col2rgb()
## blue
## red 44
## green 62
## blue 80
Brewer : for discrete data
RColorBrewer::display.brewer.all()
RColorBrewer::brewer.pal.info %>% arrange(desc(maxcolors))
## maxcolors category colorblind
## Paired 12 qual TRUE
## Set3 12 qual FALSE
## BrBG 11 div TRUE
## PiYG 11 div TRUE
## PRGn 11 div TRUE
## PuOr 11 div TRUE
## RdBu 11 div TRUE
## RdGy 11 div FALSE
## RdYlBu 11 div TRUE
## RdYlGn 11 div FALSE
## Spectral 11 div FALSE
## Pastel1 9 qual FALSE
## Set1 9 qual FALSE
## Blues 9 seq TRUE
## BuGn 9 seq TRUE
## BuPu 9 seq TRUE
## GnBu 9 seq TRUE
## Greens 9 seq TRUE
## Greys 9 seq TRUE
## Oranges 9 seq TRUE
## OrRd 9 seq TRUE
## PuBu 9 seq TRUE
## PuBuGn 9 seq TRUE
## PuRd 9 seq TRUE
## Purples 9 seq TRUE
## RdPu 9 seq TRUE
## Reds 9 seq TRUE
## YlGn 9 seq TRUE
## YlGnBu 9 seq TRUE
## YlOrBr 9 seq TRUE
## YlOrRd 9 seq TRUE
## Accent 8 qual FALSE
## Dark2 8 qual TRUE
## Pastel2 8 qual FALSE
## Set2 8 qual TRUE
RColorBrewer::brewer.pal(n = 100, name = "Blues")
## Warning in RColorBrewer::brewer.pal(n = 100, name = "Blues"): n too large, allowed maximum for palette Blues is 9
## Returning the palette you asked for with that many colors
## [1] "#F7FBFF" "#DEEBF7" "#C6DBEF" "#9ECAE1" "#6BAED6" "#4292C6" "#2171B5"
## [8] "#08519C" "#08306B"
sales_by_year_category_2_tbl %>%
ggplot(aes(x = year, y = revenue)) +
geom_col(fill = RColorBrewer::brewer.pal(n = 100, name = "Blues")[3])
## Warning in RColorBrewer::brewer.pal(n = 100, name = "Blues"): n too large, allowed maximum for palette Blues is 9
## Returning the palette you asked for with that many colors
Viridis :
viridisLite::viridis(n = 20)
## [1] "#440154FF" "#481568FF" "#482677FF" "#453781FF" "#3F4788FF" "#39558CFF"
## [7] "#32648EFF" "#2D718EFF" "#287D8EFF" "#238A8DFF" "#1F968BFF" "#20A386FF"
## [13] "#29AF7FFF" "#3CBC75FF" "#56C667FF" "#74D055FF" "#94D840FF" "#B8DE29FF"
## [19] "#DCE318FF" "#FDE725FF"
sales_by_year_category_2_tbl %>%
ggplot(aes(x = year, y = revenue)) +
geom_col(fill = viridisLite::viridis(n = 20)[5])
Used with line nd points, Outlines of rectangular objects
ggplot2 data format & Modeling data format are the same!
“Tidy Data”: One column of interest known as the target (e.g. target = revenue)
Other columns describe the target
sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue, color = category_2)) + # define Globally
geom_line() +
geom_point()
sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue)) +
geom_line(aes(color = category_2)) + # Define locally
geom_point()
sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue)) +
geom_line(aes(color = category_2)) +
geom_point(color = "dodgerblue", size = 2)
Info: * Used with fill of rectangular objects
sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue)) + # define Globally
geom_col()
sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue, fill = category_2)) + # define Globally
geom_col()
# Do not confuse fill with colour!!!
sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue, color = category_2)) + # define Globally
geom_col()
Info :
sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue)) +
geom_line(aes(colour = category_2), size = 1) +
geom_point(aes(size = revenue))
sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue, size = revenue)) +
geom_line(aes(colour = category_2)) +
geom_point()
sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue, size = revenue)) +
geom_line(aes(colour = category_2), size = 1) +
geom_point()
Great way to tease out variation by category
Goal: Sales annual sales by category 2
sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue, colour = category_2)) +
geom_line(colour = "black") +
geom_smooth(method = "lm", se = FALSE) +
facet_wrap(~ category_2, ncol = 3, scales = "free_y") +
expand_limits(y = 0)
## `geom_smooth()` using formula 'y ~ x'
# Stacked Bars & Side-By-Side Bars
sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue, fill = category_2)) +
# geom_col(position = "stack") +
# geom_col(position = "dodge") +
geom_col(position = position_dodge(width = 0.9), color = "white")
# Stacked Area
sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue, fill = category_2)) +
geom_area(colour = "black")
Plot 1: Faceted Plot, Colour = Continous Scale
g_facet_continous <- sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue, colour =revenue)) +
geom_line(size = 1) +
geom_point(size = 3) +
facet_wrap(~ category_2, scales = "free_y") +
expand_limits(y = 0) +
theme_minimal()
g_facet_continous
Plot 2: Faceted Plot, Colour = Discrete Scale
g_facet_discrete <- sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue, colour = category_2)) +
geom_line(size = 1) +
geom_point(size = 3) +
facet_wrap(~category_2, scales = "free_y") +
expand_limits(y = 0)
g_facet_discrete
Plot 2: Stacked Area plot
g_area_discrete <- sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue, fill = category_2)) +
geom_area(colour = "black") +
theme_minimal()
g_area_discrete
colour by Revenue (continuous Scale): adjusting colour gradient
g_facet_continous +
scale_color_continuous(
low = "cornflowerblue",
high = "black"
)
g_facet_continous +
scale_color_viridis_c(alpha = 0.7)
colour by Category 2: discrete Scale
g_facet_discrete +
scale_color_viridis_d()
g_facet_discrete +
scale_color_brewer(palette = "PuRd") +
theme_dark()
# RColorBrewer::display.brewer.all()
Fill by Category 2
RColorBrewer::display.brewer.all()
g_area_discrete +
scale_fill_brewer(palette = "Set3")
g_area_discrete +
scale_fill_viridis_d()
g_area_discrete +
scale_fill_tq()
g_facet_continous +
scale_color_viridis_c(alpha = 0.8) +
# gives more room to breath on the x-axis
scale_x_continuous(breaks = seq(2011, 2015, by = 2)) +
scale_y_continuous(labels = scales::dollar_format(scale = 1e-6, suffix = "M")) +
theme_minimal()
g_facet_continous +
scale_x_continuous(breaks = seq(2011, 2015, by = 2)) +
scale_y_continuous(labels = scales::dollar_format(scale = 1e-6, suffix = "M")) +
geom_smooth(method = "lm", se = FALSE, color = "black") +
scale_color_viridis_c() +
theme_dark() +
labs(
title = "Bike Sales",
subtitle = "Sales are trending up",
caption = "5 year sales trends\ncomes from our RP Data base",
x = "Year",
y = "Revenue ($M)",
colour = "Revenue"
)
## `geom_smooth()` using formula 'y ~ x'
theme_light(): Pre-set theme theme(): The function used to adjust every theme element that is part of a ggplot object
g_facet_continous +
theme_light() +
theme(
axis.text.x = element_text(
angle = 45,
hjust = 1
),
strip.background = element_rect(
colour = "black",
fill = "cornflowerblue",
size = 1
),
strip.text = element_text(
face = "bold",
colour = "White"
)
)
10.0 Putting it All Together
sales_by_year_category_2_tbl %>%
ggplot(aes(year, revenue, fill = category_2)) +
geom_area(colour = "white") +
scale_fill_brewer("Blues", direction = -1) +
scale_y_continuous(label = scales::dollar_format(scale = 10e-6, suffix = "M")) +
labs(
title = "Sales Over Year by Category 2",
subtitle = "Sales Trending Up",
caption = "Bike Sales trends look strong heading into 2016",
x = "",
y = "Revenue ($M)",
fill = "2nd Category"
) +
theme_minimal() +
theme(
title = element_text(face = "bold", colour = "#08306B")
)